Same news is good news: automatically collecting reoccurring radio news stories
Abstract
We present methods for finding identical or nearly identical news stories in hourly radio news broadcasts. Our procedures detect reoccurring news stories in subsequent news broadcasts, spoken by the same or different announcers, from the speech signal alone. They make it possible to build a large database of repeated, professionally read speech at low cost, which is especially interesting for prosody research but also, e.g., for concept-to-speech and socio-linguistic studies. An automatically recorded complete radio news broadcast is first segmented into individual news stories using HMM recognition. Then, the word sequence estimates of the stories are either compared directly (naive method) or realigned with the signal of other stories (realignment method) in order to determine which stories have been read before and which have not. Both methods can be further improved by computing “meta distances” that also take into account distances to other stories. We evaluate and compare the usefulness of the proposed methods on real-life data. We find that the realignment method combined with meta distances is the most reliable of the methods and that it is well suited to the task.
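The naive comparison and the meta-distance idea described in the abstract can be illustrated with a minimal sketch. The abstract does not define either distance, so everything here is an assumption made for illustration: the direct distance is taken to be a length-normalised word-level edit distance between recognised word sequences, and the meta distance is taken to be the Euclidean distance between two stories' distance profiles to all stories. The names `word_edit_distance`, `distance_matrix`, and `meta_distance` are hypothetical, and the realignment method is not covered.

```python
from math import sqrt


def word_edit_distance(a, b):
    """Levenshtein distance over word tokens, divided by the longer length.

    Assumption: stands in for the "naive method"; the paper's actual
    comparison of word sequence estimates may differ.
    """
    if not a and not b:
        return 0.0
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            cost = 0 if wa == wb else 1
            curr.append(min(prev[j] + 1,          # delete a word from a
                            curr[j - 1] + 1,      # insert a word from b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1] / max(len(a), len(b))


def distance_matrix(stories):
    """Pairwise direct distances between all stories."""
    n = len(stories)
    return [[word_edit_distance(stories[i], stories[j]) for j in range(n)]
            for i in range(n)]


def meta_distance(dist, i, j):
    """Euclidean distance between the distance profiles of stories i and j.

    Assumption: one plausible reading of "meta distances"; two stories are
    close if they are similarly distant from every story in the collection.
    """
    return sqrt(sum((dist[i][k] - dist[j][k]) ** 2 for k in range(len(dist))))


# Hypothetical recogniser output for three stories.
stories = [
    "the minister resigned on monday".split(),
    "the minister resigned monday".split(),   # same story, one word missed
    "heavy rain is expected tonight".split(),
]
dist = distance_matrix(stories)
print(dist[0][1], meta_distance(dist, 0, 1), meta_distance(dist, 0, 2))
```

In this toy setup the repeated story pair gets a much smaller meta distance than the unrelated pair, because the two repeats also agree in how far they are from the third story.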
Similar resources
Radio : Content Filtering and Delivery for Broadcast Audio
Synthetic News Radio uses automatic speech recognition and clustered text news stories to automatically find story boundaries in an audio news broadcast, and it creates semantic representations that can match stories of similar content through audio-based queries. Current speech recognition technology cannot by itself produce enough information to accurately characterize news audio; therefore, ...
The Físchlár-News-Stories System: Personalised Access to an Archive of TV News
The “Físchlár” systems are a family of tools for capturing, analysis, indexing, browsing, searching and summarisation of digital video information. Físchlár-News-Stories, described in this paper, is one of those systems, and provides access to a growing archive of broadcast TV news. Físchlár-News-Stories has several notable features including the fact that it automatically records TV news and s...
NewsViz: Emotional Visualization of News Stories
The NewsViz system aims to enhance news reading experiences by integrating 30 seconds long Flash-animations into news article web pages depicting their content and emotional aspects. NewsViz interprets football match news texts automatically and creates abstract 2D visualizations. The user interface enables animators to further refine the animations. Here, we focus on the emotion extraction com...
Implications of News Segments and Movies for Enhancing Listening Comprehension of Language Learners
Armed with technological development, the present study aimed at gauging the effectiveness of exposure to news and movies as two types of audiovisual programs in improving language learners’ listening comprehension at the intermediate level. To this end, a listening comprehension test was administered to 108 language learners and finally 60 language learners were selected as intermedia...
Browsing System
In this demo, we present a system we have developed for automatic broadcast-quality video indexing that successfully combines results from the fields of speaker verification, acoustic analysis, very large vocabulary caption character recognition, content based sampling of video, information retrieval, dialogue systems, and ASF media delivery over IP. The prototype system of this demo is availab...